Python: Add Pandas SQLi sinks#19594
Merged
tausbn merged 4 commits intogithub:mainfrom Jun 2, 2025
Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull Request Overview
This PR introduces explicit SQL injection modeling for pandas.read_sql and pandas.read_sql_query, updates existing tests to use an sqlite3 connection for sink detection, and adds a change note entry.
- Updated pandas DataFrame test to connect via
sqlite3and mark SQL sinks with$getSql. - Added a
ReadSQLCallQL class to flag raw SQL queries in Pandas. - Documented the new analysis in the change notes.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| python/ql/test/library-tests/frameworks/pandas/dataframe_query.py | Replaced path-based calls with sqlite3.connect, updated sink markers |
| python/ql/src/change-notes/2025-05-26-pandas-sqli-sinks.md | Added entry for Pandas SQLi sinks |
| python/ql/lib/semmle/python/frameworks/Pandas.qll | Introduced ReadSQLCall to model read_sql and read_sql_query as sinks |
Comments suppressed due to low confidence (2)
python/ql/test/library-tests/frameworks/pandas/dataframe_query.py:59
- This test covers positional arguments but does not cover keyword argument usage (e.g.,
sql=andcon=). Adding a test forpd.read_sql_query(sql="...", con=connection)would ensure named-parameter sinks are detected.
df = pd.read_sql_query("sql query", connection) # $getSql="sql query"
python/ql/lib/semmle/python/frameworks/Pandas.qll:161
- Unconditionally treating
read_sqlas a sink may flag saferead_sql_tablecalls (whichread_sqlcan dispatch to). Consider refining the model to only treat calls that route toread_sql_queryor inspect argument patterns to avoid false positives.
ReadSQLCall() { this = API::moduleImport("pandas").getMember(["read_sql", "read_sql_query"]).getACall() }
tausbn
previously approved these changes
May 27, 2025
Contributor
tausbn
left a comment
There was a problem hiding this comment.
Short and sweet! Looks good to me. 👍
Contributor
Author
|
Thanks @tausbn ! I fixed some nitpicks from automation :) |
tausbn
approved these changes
May 27, 2025
Contributor
tausbn
left a comment
There was a problem hiding this comment.
Thank you for the additional fixes! 🙂
Contributor
Author
|
Hi @tausbn I don't have the rights to merge, this one is ready from my side :) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Pandas has a
read_sqlsink, which allows for executing raw SQL queries with untrusted input.read_sqlis a convenience method, that dispatches the query either toread_sql_tableorread_sql_query. Please note thatread_sql_tableis not vulnerable to SQLi, butread_sql_queryis.Note that instead of adding new tests, I edited the previous tests, because they contained the vulnerable
read_sqlandread_sql_querymethod, which made them show up as failed.